TripleTen - Sprint_06_Development_Tools_Project - Dan Lieberman¶
Contents ¶
Introduction ¶
This project is designed to give us more practice with common software engineering tasks, which will help us enhance and complement our data skills.
We were given the challenge of not only completing typical analysis tasks but also developing and deploying a web application to a cloud service for public access.
To support us in this endeavor, we were given a dataset containing car sales advertisements.
Goal:¶
The project is based on 4 main steps:
- Examine the methods for filling missing values in the 'cylinders' field.
- Create a couple of histograms and scatterplots
- Develop a web application dashboard.
- Deploy the application to Render.
Stages¶
Data on car sales advertisements is stored in the file vehicles_us.csv.
First, we'll create the following accounts for the project's Virtual Environment: GitHub, Render, VS Code, along side with the necessary packages installation.
Second, we'll Download the data file.
Third, we'll evaluate the quality of the data and see whether its issues are significant.
--
The project will consist of the following stages:
- Data overview.
- Data preprocessing.
- Exploratory Data Analysis - EDA.
- Develop a web application, and Deploy it to Render.
Stage 1. Data overview ¶
import pandas as pd
df = pd.read_csv('../vehicles_us.csv')
df.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 51525 entries, 0 to 51524 Data columns (total 13 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 price 51525 non-null int64 1 model_year 47906 non-null float64 2 model 51525 non-null object 3 condition 51525 non-null object 4 cylinders 46265 non-null float64 5 fuel 51525 non-null object 6 odometer 43633 non-null float64 7 transmission 51525 non-null object 8 type 51525 non-null object 9 paint_color 42258 non-null object 10 is_4wd 25572 non-null float64 11 date_posted 51525 non-null object 12 days_listed 51525 non-null int64 dtypes: float64(4), int64(2), object(7) memory usage: 5.1+ MB
df.head(10)
| price | model_year | model | condition | cylinders | fuel | odometer | transmission | type | paint_color | is_4wd | date_posted | days_listed | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 9400 | 2011.0 | bmw x5 | good | 6.0 | gas | 145000.0 | automatic | SUV | NaN | 1.0 | 2018-06-23 | 19 |
| 1 | 25500 | NaN | ford f-150 | good | 6.0 | gas | 88705.0 | automatic | pickup | white | 1.0 | 2018-10-19 | 50 |
| 2 | 5500 | 2013.0 | hyundai sonata | like new | 4.0 | gas | 110000.0 | automatic | sedan | red | NaN | 2019-02-07 | 79 |
| 3 | 1500 | 2003.0 | ford f-150 | fair | 8.0 | gas | NaN | automatic | pickup | NaN | NaN | 2019-03-22 | 9 |
| 4 | 14900 | 2017.0 | chrysler 200 | excellent | 4.0 | gas | 80903.0 | automatic | sedan | black | NaN | 2019-04-02 | 28 |
| 5 | 14990 | 2014.0 | chrysler 300 | excellent | 6.0 | gas | 57954.0 | automatic | sedan | black | 1.0 | 2018-06-20 | 15 |
| 6 | 12990 | 2015.0 | toyota camry | excellent | 4.0 | gas | 79212.0 | automatic | sedan | white | NaN | 2018-12-27 | 73 |
| 7 | 15990 | 2013.0 | honda pilot | excellent | 6.0 | gas | 109473.0 | automatic | SUV | black | 1.0 | 2019-01-07 | 68 |
| 8 | 11500 | 2012.0 | kia sorento | excellent | 4.0 | gas | 104174.0 | automatic | SUV | NaN | 1.0 | 2018-07-16 | 19 |
| 9 | 9200 | 2008.0 | honda pilot | excellent | NaN | gas | 147191.0 | automatic | SUV | blue | 1.0 | 2019-02-15 | 17 |
df.tail(5)
| price | model_year | model | condition | cylinders | fuel | odometer | transmission | type | paint_color | is_4wd | date_posted | days_listed | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 51520 | 9249 | 2013.0 | nissan maxima | like new | 6.0 | gas | 88136.0 | automatic | sedan | black | NaN | 2018-10-03 | 37 |
| 51521 | 2700 | 2002.0 | honda civic | salvage | 4.0 | gas | 181500.0 | automatic | sedan | white | NaN | 2018-11-14 | 22 |
| 51522 | 3950 | 2009.0 | hyundai sonata | excellent | 4.0 | gas | 128000.0 | automatic | sedan | blue | NaN | 2018-11-15 | 32 |
| 51523 | 7455 | 2013.0 | toyota corolla | good | 4.0 | gas | 139573.0 | automatic | sedan | black | NaN | 2018-07-02 | 71 |
| 51524 | 6300 | 2014.0 | nissan altima | good | 4.0 | gas | NaN | automatic | sedan | NaN | NaN | 2018-06-05 | 10 |
df.isnull().sum()
price 0 model_year 3619 model 0 condition 0 cylinders 5260 fuel 0 odometer 7892 transmission 0 type 0 paint_color 9267 is_4wd 25953 date_posted 0 days_listed 0 dtype: int64
conclusion ¶
In this stage, we conducted an initial overview of the car sales advertisements dataset, which contains 51,525 entries and 13 columns. Key findings include:
Data Structure: The dataset includes attributes like price, model_year, model, condition, cylinders, fuel, odometer, transmission, type, paint_color, is_4wd, date_posted, and days_listed.
Missing Values: Several columns have missing values: model_year (3,619), cylinders (5,260), odometer (7,892), paint_color (9,267), and is_4wd (25,953).
Data Types: The data types are appropriate for each column
Stage 2. Data preprocessing ¶
2.1 Missing values ¶
df['paint_color'] = df['paint_color'].fillna('unknown')
df['is_4wd'] = df['is_4wd'].fillna(0)
df['is_4wd'] = df['is_4wd'] == 1
df['odometer'] = df['odometer'].fillna(0)
2.2 Removing records with NaN model year ¶
Although the model year is a crucial data point in advertising, yet some records lack this information.
Instead of imputing missing model years, I chose to remove these records to maintain the integrity of my analysis.
However, further analysis and accuracy could benefit from additional data containing this information.
# Remove the records of the cars with model_year is Nan
df = df[df['model_year'].notna()]
# Changing the model_year type to int
df['model_year'] = df['model_year'].astype('int')
df.isnull().sum()
price 0 model_year 0 model 0 condition 0 cylinders 4897 fuel 0 odometer 0 transmission 0 type 0 paint_color 0 is_4wd 0 date_posted 0 days_listed 0 dtype: int64
2.3 The cylinder field ¶
As evident from the list of missing values, we still have gaps in the 'cylinders' field.
We plan to address this issue later, as it requires careful consideration and may impact the accuracy of our analysis.
2.4 Duplicates ¶
Obvious Duplicates¶
# counting clear duplicates
print('Number of overall duplicates:', df.duplicated().sum())
Number of overall duplicates: 0
Implicit Duplicates¶
def print_sorted_unique_values(df, columns):
for column in columns:
unique_values = df[column].astype(str).unique()
unique_values.sort()
print(f"Unique values for {column}: {unique_values}")
# columns to search in
columns_to_check = ['condition', 'fuel', 'transmission', 'type', 'paint_color']
# Call the function
print_sorted_unique_values(df, columns_to_check)
Unique values for condition: ['excellent' 'fair' 'good' 'like new' 'new' 'salvage'] Unique values for fuel: ['diesel' 'electric' 'gas' 'hybrid' 'other'] Unique values for transmission: ['automatic' 'manual' 'other'] Unique values for type: ['SUV' 'bus' 'convertible' 'coupe' 'hatchback' 'mini-van' 'offroad' 'other' 'pickup' 'sedan' 'truck' 'van' 'wagon'] Unique values for paint_color: ['black' 'blue' 'brown' 'custom' 'green' 'grey' 'orange' 'purple' 'red' 'silver' 'unknown' 'white' 'yellow']
conclusion ¶
In conclusion, the data preprocessing stage involved several key steps to prepare the dataset for analysis:
Handling Missing Values:¶
Missing values in the 'paint_color', 'is_4wd', and 'odometer' columns were filled or replaced appropriately. Records with missing model years were removed from the dataset to maintain data integrity.
Cylinder Field:¶
The 'cylinders' field still contains missing values, which will be addressed in subsequent stages due to its potential impact on analysis accuracy.
Duplicates:¶
Both obvious and implicit duplicates were checked and appropriately handled.
Further analysis and model development will benefit from these preprocessing steps, ensuring that the dataset is ready for detailed exploration and modeling.
Stage 3. Exploratory Data Analysis - EDA ¶
3.1 Cylinders column ¶
In the following section, we will delve into the 'cylinders' column.
While this examination may not align with traditional exploratory data analysis (EDA) procedures, I have chosen to include it in this section due to its relevance in understanding the dataset.
This analysis has also led to valuable insights for further testing and analysis
df['cylinders'].unique()
array([ 6., 4., 8., nan, 5., 10., 3., 12.])
# Group the data by 'model' and 'model_year'
grouped = df.groupby(['model', 'model_year'])
# Calculate the median number of cylinders for each group
median_cylinders = grouped['cylinders'].median()
median_cylinders.unique()
array([ 6. , 8. , 7. , nan, 4. , 5. , 4.5, 10. , 9. , 5.5])
Upon reviewing the results, it appears that using the median function may not be the most appropriate approach.
The issue arises when there is an even number of candidates, as the median function returns their mean value.
A potentially better alternative is to determine the most common value in the group, which aligns more closely with the intended purpose of the median function but with greater accuracy, given the size of the dataset.
Please note that I used 'groupby' and 'median' functions to calculate the median number of cylinders for each group.
grouped = df.groupby(['model', 'model_year'])
grouped['cylinders'].unique()
model model_year
acura tl 1999 [6.0, nan]
2001 [6.0]
2002 [6.0]
2003 [6.0, nan]
2004 [6.0, 4.0, nan]
...
volkswagen passat 2014 [4.0, nan, 5.0]
2015 [4.0, nan, 5.0]
2016 [4.0, nan, 6.0]
2017 [4.0, nan]
2018 [4.0]
Name: cylinders, Length: 2226, dtype: object
Using of the 'Most Common' approach¶
# Group the data by 'model' and 'model_year'
grouped = df.groupby(['model', 'model_year'])
# Calculate the most common number of cylinders for each group
most_common_cylinders = grouped['cylinders'].apply(lambda x: x.mode()[0] if len(x.mode()) > 0 else None)
most_common_cylinders.unique()
array([ 6., 8., nan, 4., 5., 10.])
It's evident that the numbers 3 and 12 are missing from the new list. This discrepancy may be attributed to errors in the records corresponding to these values, although it could also be related to sub-model issues.
Therefore, I opted to correct only the NaN values and not the records associated with 3 and 12.
Additionally, some records have missing values for the 'cylinders' field, indicated by 'None'.
# Fill NaN values in 'cylinders' column with the calculated must common values
common_cylinders_series = df.set_index(['model', 'model_year']).index.map(most_common_cylinders.get)
df['cylinders'] = df['cylinders'].fillna(pd.Series(common_cylinders_series, index=df.index))
df['cylinders'].unique()
array([ 6., 4., 8., 5., 10., nan, 3., 12.])
Reviewing the NaNs¶
df[df['cylinders'].isnull()].sort_values(["model","model_year"])
| price | model_year | model | condition | cylinders | fuel | odometer | transmission | type | paint_color | is_4wd | date_posted | days_listed | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 11087 | 16000 | 1971 | chevrolet camaro | excellent | NaN | gas | 0.0 | manual | coupe | brown | False | 2018-12-28 | 125 |
| 28799 | 45900 | 1971 | chevrolet camaro | like new | NaN | gas | 0.0 | manual | coupe | orange | False | 2019-01-10 | 75 |
| 37000 | 6800 | 1980 | chevrolet malibu | good | NaN | gas | 0.0 | automatic | coupe | brown | False | 2019-02-26 | 33 |
| 3914 | 9750 | 2007 | chevrolet silverado 1500 crew | excellent | NaN | gas | 133000.0 | automatic | pickup | grey | False | 2018-08-22 | 55 |
| 36582 | 44900 | 1949 | chevrolet suburban | good | NaN | gas | 1800.0 | automatic | wagon | orange | False | 2018-08-19 | 10 |
| 6982 | 3900 | 1977 | chevrolet suburban | fair | NaN | gas | 0.0 | automatic | SUV | custom | False | 2019-02-02 | 71 |
| 15623 | 5300 | 2005 | dodge charger | excellent | NaN | gas | 125000.0 | automatic | sedan | red | False | 2018-10-30 | 46 |
| 33965 | 5800 | 1971 | ford econoline | good | NaN | gas | 78000.0 | automatic | van | blue | False | 2019-04-01 | 40 |
| 14752 | 15000 | 1954 | ford f-150 | excellent | NaN | gas | 3565.0 | manual | pickup | black | False | 2019-02-16 | 13 |
| 1101 | 9200 | 1975 | ford f-150 | excellent | NaN | gas | 0.0 | automatic | truck | green | False | 2018-08-28 | 40 |
| 15948 | 14999 | 2007 | ford f-250 sd | good | NaN | diesel | 237726.0 | automatic | truck | custom | True | 2018-11-21 | 46 |
| 37146 | 2000 | 1994 | ford f-250 super duty | fair | NaN | diesel | 224000.0 | automatic | truck | white | False | 2018-07-13 | 11 |
| 3139 | 3900 | 2003 | ford f-350 sd | good | NaN | gas | 0.0 | automatic | truck | black | True | 2018-06-20 | 59 |
| 6932 | 4000 | 1993 | ford f150 | excellent | NaN | gas | 0.0 | automatic | truck | unknown | True | 2019-01-28 | 18 |
| 5870 | 2500 | 2003 | ford f150 supercrew cab xlt | good | NaN | gas | 169000.0 | automatic | truck | green | True | 2019-01-17 | 18 |
| 7113 | 84900 | 2019 | ford f350 | new | NaN | diesel | 9000.0 | automatic | pickup | white | True | 2019-03-26 | 7 |
| 47761 | 3800 | 1984 | honda accord | new | NaN | gas | 121000.0 | manual | sedan | blue | False | 2019-01-05 | 162 |
| 7121 | 6900 | 2009 | jeep cherokee | excellent | NaN | gas | 130023.0 | automatic | SUV | black | True | 2018-09-03 | 19 |
| 43595 | 5995 | 1988 | jeep wrangler | good | NaN | gas | 100309.0 | manual | SUV | white | True | 2018-08-17 | 12 |
| 36935 | 3995 | 2009 | kia sorento | excellent | NaN | gas | 211698.0 | automatic | SUV | grey | True | 2018-10-08 | 56 |
| 3844 | 1900 | 1986 | nissan sentra | excellent | NaN | gas | 0.0 | manual | coupe | blue | False | 2018-10-29 | 63 |
| 35818 | 1700 | 1986 | nissan sentra | excellent | NaN | gas | 152000.0 | manual | coupe | blue | False | 2018-09-29 | 23 |
| 27075 | 3800 | 1988 | ram 2500 | good | NaN | gas | 64000.0 | automatic | pickup | unknown | True | 2018-08-26 | 34 |
| 29822 | 1800 | 1989 | toyota camry | excellent | NaN | gas | 196617.0 | automatic | sedan | blue | False | 2019-03-01 | 32 |
| 39952 | 1900 | 1990 | toyota camry | good | NaN | gas | 156481.0 | automatic | sedan | unknown | False | 2018-09-29 | 43 |
| 12760 | 1050 | 1995 | toyota corolla | fair | NaN | gas | 428000.0 | manual | sedan | black | False | 2018-06-13 | 12 |
df['cylinders'].isnull().sum()
26
# checking for a unique NaN
df1 = df[df["model"] == "toyota camry"]
df1[df1["model_year"] == 1989]
| price | model_year | model | condition | cylinders | fuel | odometer | transmission | type | paint_color | is_4wd | date_posted | days_listed | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 29822 | 1800 | 1989 | toyota camry | excellent | NaN | gas | 196617.0 | automatic | sedan | blue | False | 2019-03-01 | 32 |
To conduct a more precise examination of the NaN values, we could have compared them against a range of years from the same model.
However, given that the number of NaN values is small relative to the overall total, I chose not to focus extensively on this aspect.
Checking the 'Cylinders' data after filling NaN values¶
df.head(15)
| price | model_year | model | condition | cylinders | fuel | odometer | transmission | type | paint_color | is_4wd | date_posted | days_listed | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 9400 | 2011 | bmw x5 | good | 6.0 | gas | 145000.0 | automatic | SUV | unknown | True | 2018-06-23 | 19 |
| 2 | 5500 | 2013 | hyundai sonata | like new | 4.0 | gas | 110000.0 | automatic | sedan | red | False | 2019-02-07 | 79 |
| 3 | 1500 | 2003 | ford f-150 | fair | 8.0 | gas | 0.0 | automatic | pickup | unknown | False | 2019-03-22 | 9 |
| 4 | 14900 | 2017 | chrysler 200 | excellent | 4.0 | gas | 80903.0 | automatic | sedan | black | False | 2019-04-02 | 28 |
| 5 | 14990 | 2014 | chrysler 300 | excellent | 6.0 | gas | 57954.0 | automatic | sedan | black | True | 2018-06-20 | 15 |
| 6 | 12990 | 2015 | toyota camry | excellent | 4.0 | gas | 79212.0 | automatic | sedan | white | False | 2018-12-27 | 73 |
| 7 | 15990 | 2013 | honda pilot | excellent | 6.0 | gas | 109473.0 | automatic | SUV | black | True | 2019-01-07 | 68 |
| 8 | 11500 | 2012 | kia sorento | excellent | 4.0 | gas | 104174.0 | automatic | SUV | unknown | True | 2018-07-16 | 19 |
| 9 | 9200 | 2008 | honda pilot | excellent | 6.0 | gas | 147191.0 | automatic | SUV | blue | True | 2019-02-15 | 17 |
| 10 | 19500 | 2011 | chevrolet silverado 1500 | excellent | 8.0 | gas | 128413.0 | automatic | pickup | black | True | 2018-09-17 | 38 |
| 11 | 8990 | 2012 | honda accord | excellent | 4.0 | gas | 111142.0 | automatic | sedan | grey | False | 2019-03-28 | 29 |
| 12 | 18990 | 2012 | ram 1500 | excellent | 8.0 | gas | 140742.0 | automatic | pickup | unknown | True | 2019-04-02 | 37 |
| 13 | 16500 | 2018 | hyundai sonata | excellent | 4.0 | gas | 22104.0 | automatic | sedan | silver | False | 2019-01-14 | 29 |
| 14 | 12990 | 2009 | gmc yukon | excellent | 8.0 | gas | 132285.0 | automatic | SUV | black | True | 2019-01-31 | 24 |
| 15 | 17990 | 2013 | ram 1500 | excellent | 8.0 | gas | 0.0 | automatic | pickup | red | True | 2018-05-15 | 111 |
df['cylinders'].unique()
array([ 6., 4., 8., 5., 10., nan, 3., 12.])
cylinders_counts = df['cylinders'].value_counts()
print("Count of each unique value in 'cylinders' column:")
print(cylinders_counts)
Count of each unique value in 'cylinders' column: cylinders 8.0 16339 6.0 16330 4.0 14368 10.0 538 5.0 270 3.0 33 12.0 2 Name: count, dtype: int64
Conclusion ¶
The analysis of the 'cylinders' column revealed several key findings.
Firstly, the dataset contains records where the 'cylinders' field is missing or marked as 'None', indicating a data quality issue.
Secondly, the use of the median function to impute missing values may not be ideal, as it can introduce inaccuracies when dealing with an even number of candidates.
Instead, considering the most common value in each group could provide a more accurate representation of the data.
Finally, the absence of the numbers 3 and 12 in the updated list warrants further investigation, as it may indicate errors or sub-model issues in the dataset.
3.2 Manufacturer of 4WD cars by Year - Histograms ¶
This section examines the distribution of 4WD cars manufactured over the years using histograms.
We add a 'manufacturer' column to the dataset, assuming the first word of the model name indicates the manufacturer.
The analysis focuses on 4WD cars from 2009 onwards, showing the number of cars per manufacturer per year.
We also look specifically at 4WD cars in 2018, filtering for models with at least 20 cars, to visualize the distribution per model.
# Adding the manufacturer column. Assuiming the first word of the model is that.
df['manufacturer'] = df['model'].str.split(' ').str[0]
import plotly.express as px
year_df = df[df['model_year'] >= 2009]
df_count = year_df[year_df['is_4wd']].groupby(['manufacturer', 'model_year']).size().reset_index(name='Number of Cars')
fig = px.bar(df_count, x='model_year', y='Number of Cars', color='manufacturer', barmode='group', title='Amount of 4WD Cars per Year per Manufacturer')
fig.update_layout(xaxis_title="Model Year", xaxis={'tickvals': df_count['model_year']})
fig.show()
The histogram above indicates a notable shift in 4WD vehicle production in 2018.
Ford, which had been the dominant manufacturer from 2010 to 2018, was surpassed by Chevy and lost its leading position.
The shift in 4WD vehicle production in 2018, with Chevy surpassing Ford, prompted a closer examination of the models produced that year.
The histogram below showcases these models.
year_df = df[df['model_year'] == 2018]
df_count = year_df[year_df['is_4wd']].groupby(['model', 'model_year']).size().reset_index(name='Number of cars')
df_count = df_count[df_count['Number of cars'] >= 20]
fig = px.bar(df_count, x='model_year', y='Number of cars', color='model', barmode='group', title='Amount of 4WD Cars in 2018 per Model (filtered for >= 20)')
fig.show()
Conclusion ¶
The analysis of 4WD vehicle production revealed interesting trends.
The histogram showed a significant change in 2018, with Chevrolet overtaking Ford as the leading manufacturer.
This shift prompted a focused examination of the models produced in 2018, highlighting the dynamic nature of the automotive industry.
The findings underscore the importance of monitoring production trends to understand market dynamics and consumer preferences.
3.3 Price Vs Model Year - Scatter Plot ¶
In this section, we investigate the correlation between vehicle price and model year through scatter plots.
This analysis aims to uncover trends or patterns in price fluctuations across different model years, providing insights into vehicle depreciation rates and market values over time.
Additionally, we aim to identify any anomalies in these patterns.
df_filtered = df[df['price'] >= 100]
fig = px.scatter(df_filtered, x='model_year', y='price',
hover_name='model', title='Price vs Model Year',
labels={'model_year': 'Model Year', 'price': 'Price'})
# Update layout
fig.update_layout(showlegend=False)
# Show the chart
fig.show()
Conclusion ¶
The scatterplot reveals a notable upward trend in prices from 1996 onwards, indicating that newer vehicles tend to have higher prices. However, between the late 1970s and 1990s, the graph shows relative price stability, suggesting the presence of influential factors that require further analysis.
One of the most intriguing observations in the scatter plot is the presence of a dense cloud of data points, indicating inconsistency and numerous exceptions, particularly noticeable between the 1960s and the mid-late 1970s. This phenomenon is likely due to many cars from this era being preserved as collector's items, including 'muscle' cars that were discontinued around the time of the oil crisis. Prices of these vehicles have been well maintained and have even doubled, tripled, and enen more in value compared to relatively newer vehicles. This lack of a clear trend suggests that prices for these vehicles are highly variable, with sellers able to demand and receive prices based on the condition and rarity of the vehicle.
Stage 4. Develop a Web Application, and Deploy it to Render ¶
Saving the fixed CSV file for the application.¶
df.to_csv('../vehicles_us_fixed.csv')
The final part of the project has been completed and is now being implemented in the app.py file and on the specified platforms.
Final Conclusions ¶
The analysis of the car sales advertisements dataset has provided valuable insights into the trends and characteristics of the vehicles listed.
Here are the key findings and conclusions from the analysis:
Data Overview: The dataset contains 51,525 entries and 13 columns, including attributes such as price, model_year, model, condition, cylinders, fuel, odometer, transmission, type, paint_color, is_4wd, date_posted, and days_listed.
Missing Values: Several columns have missing values, including model_year, cylinders, odometer, paint_color, and is_4wd. These missing values were identified and may need to be addressed in further analysis to ensure the accuracy and completeness of the dataset.
Data Types: The data types for each column are appropriate, ensuring that the dataset is structured correctly for analysis.
Price vs. Model Year: A scatter plot analysis revealed an upward trend in prices from 1996 onwards, with newer vehicles generally commanding higher prices. However, between the late 1970s and 1990s, the graph showed relative price stability, indicating the presence of influencing factors that require further analysis.
Manufacturer of 4WD Cars by Year: Analysis of 4WD vehicle production showed a significant shift in 2018, with Chevrolet overtaking Ford as the leading manufacturer. This shift prompted a focused examination of the models produced in 2018, revealing interesting trends in vehicle production.
Overall, this analysis provides a comprehensive overview of the car sales advertisements dataset and highlights key trends and patterns in the data. Further analysis and exploration of the dataset could provide additional insights into the factors influencing vehicle prices, production trends, and market dynamics.